A Case for the Global Access to Large Distributed Data Sets Using Data Webs Employing Photonic Data Services

Authors

  • Robert L. Grossman
  • Yunhong Gu
  • David Hanley
  • Xinwei Hong
  • Jorge Levera
  • Marco Mazzucco
  • Dave Lillethun
  • Joe Mambretti
  • Jeremy Weinberger
Abstract

We argue that data webs employing specialized path services, network protocols, and data protocols can be an effective platform for accessing and analyzing millions of distributed data sets of gigabyte size and larger. We have built a prototype of such a data web and demonstrated that, by using specialized network and data protocols, it can effectively access, analyze, and mine distributed gigabyte-size data sets even over thousands of miles. The prototype uses a server that employs the DataSpace Transfer Protocol (DSTP). Our assumption is that WSDL/SOAP/UDDI-based discovery and description services will enable this same infrastructure to scale to millions of such DSTPServers.





Publication date: 2003